Infinit-O is the trusted, customer-centric, and sustainable leader in Business Process Optimization. We empower finance and healthcare organizations to thrive in a digital-first world by combining specialized industry expertise and innovative technology for 20 years. We navigate complex industry landscapes to drive transformative outcomes, helping businesses streamline operations, enhance customer experience, and achieve sustainable growth backed by a world-class Net Promoter Score of 75. Our approach combines operational efficiency with a human-centered ethos, ensuring sustainable value creation for our clients and team members.
As a Certified B Corporation, Infinit-O is committed to the highest standards of social and environmental performance, accountability, and transparency. We embed these values into every aspect of our operations—aligning business success with a positive impact on our clients, people, and communities.
Our commitment to Diversity, Equity, and Inclusion (DEI) is integral to our mission. We believe that building inclusive, equitable teams is not only the right thing to do—it is also essential for driving innovation and better business outcomes. We actively promote equal opportunity through inclusive hiring practices, continuous learning programs, and regular equity assessments to ensure a fair and empowering workplace for all.
Role Overview We are hiring an Associate DevOps Engineer (Platform Reliability & Operations) to own the reliability, release pipeline, environment lifecycle, and operational governance of our Microsoftcentric platform portfolio (Azure, Power Platform, Copilot Studio, Dataverse, Microsoft Fabric, etc.). This is a hands-on, mission-critical role that blends DevOps, cloud operations, incident response, platform administration, and continuous improvement. You will operate across Dev / UAT / PreProd / Prod environments, manage credentials and access at scale, implement secure CI/CD, and drive platform hardening and automation. The role is ideal for someone who loves troubleshooting, root-cause analysis, building robust fallback/rollback patterns, and helping teams ship safe, observable systems. This is a fresher level position with significant exposure to stakeholders, product teams, clients, and platform governance. Strong learning orientation and willingness to lead small R&D efforts are required.
Key Responsibilities
1. Platform Operations & Reliability
- Own day-to-day operational health of platform components (Azure services, Fabric, Power Platform, Dataverse, Copilot Studio).
- Monitor service health, manage alerts and SLOs, run capacity planning, and drive remediation for recurring issues.
- Implement observability (metrics, logs, traces), dashboards, and automated on-call playbooks.
- Perform root cause analysis (RCA) and lead post-incident reviews with actionable followups.
2. SDLC, Release & Pipeline Management
- Design, implement, and maintain CI/CD pipelines using Azure DevOps (or GitHub Actions where applicable), including build, test, security scans, and gated releases.
- Manage environment promotions (Dev → UAT → Pre-Prod → Prod), feature-flag strategies, blue/green and canary deployments, and rollback plans.
- Automate environment provisioning and configuration using IaC (ARM templates, Bicep, Terraform) and GitOps principles.
3. Security, Credentials & Access Management
- Own secrets and credential lifecycle using Azure Key Vault (or equivalent), rotate keys, and manage service principals and managed identities.
- Implement RBAC, conditional access, and least-privilege access models for internal teams and client tenants, coordinate with Azure AD, PIM, and security teams.
- Support compliance and audit needs (log retention, access trails) for SOC2/ISO/security assessments.
4. Platform Administration & Governance
- Administer Power Platform environments, Copilot Studio tenants, Dataverse tenancy objects, and Fabric admin centres.
- Manage tenant-level policies, data loss prevention (DLP) rules, environment provisioning guardrails, and licensing considerations.
- Define and enforce governance processes, onboarding/offboarding procedures, and tenant separation for client vs internal workloads.
5. Incident Management & Troubleshooting
- Act as escalation for complex incidents, perform deep diagnostics (application, infra, network, data pipelines), and coordinate cross-team remediation.
- Create and maintain runbooks, playbooks, and SOPs for common incidents and failover scenarios.
- Implement automated corrective actions where feasible (self-heal scripts, automated scaling, cleanup jobs).
6. Integrations & Platform Services
- Support integrations across API Management, Azure Functions, Logic Apps, Event Grid, Service Bus, Synapse / Fabric data flows, and third-party SaaS connectors.
- Collaborate with engineers to design resilient integration patterns and handle data-contract changes, schema migration, and bindings.
7. Quality, Automation & Tooling
- Drive automated testing in pipelines (unit, integration, smoke, regression) and include security checks (SAST, dependency scanning).
- Implement telemetry-based release gates and rollback triggers.
- Build utilities and small automation tools to reduce manual toil and speed troubleshooting.
8. Stakeholder & Client Management
- Serve as the platform contact for product teams and client administrators — coordinate releases, permissions, and tenant configuration for clients.
- Provide regular operational reports, risk assessments, and release notes to stakeholders.
- Mentor and enable application teams on platform best practices and platform self-service.
Required Skills & Experience
- Bachelor’s degree in Computer Science, IT, or related field — or equivalent practical experience.
- Hands-on experience with Azure platform services (App Services, AKS, Functions, Key Vault, Azure AD, Monitor, Resource Manager).
- Practical experience in Azure DevOps (pipelines, repos, artifacts) or GitHub Actions for CI/CD.
- Experience administering Power Platform, Dataverse, and familiarity with Copilot Studio / tenant admin concepts.
- Strong scripting and automation skills (Python, PowerShell, Bash).
- Familiarity with IaC (ARM/Bicep/Terraform) and deployment automation.
- Solid understanding of networking fundamentals, identity (Azure AD), RBAC, and secrets management.
- Experience with monitoring and observability tooling (Azure Monitor, Log Analytics, Application Insights, Prometheus/Grafana).
- Strong troubleshooting, RCA, and incident management capabilities.
- Good written and verbal communication; comfortable with client-facing coordination.
Preferred / Nice-to-Have
- Certifications: Microsoft Certified: Azure Administrator, Azure DevOps Engineer, Power Platform + Copilot Studio certifications.
- Experience with container orchestration (AKS), Kubernetes, and container CI/CD.
- Exposure to data platform operations (Microsoft Fabric, Synapse, Data Factory).
- Knowledge of security scanning tools, SAST/DAST, and compliance frameworks (SOC2/ISO27001).
- Experience with service-mesh, API Management, and event-driven architectures.
- Prior experience in multi-tenant SaaS operations or managing client-specific tenants/environments.
Professional Attributes & Mindset
- Proactive operator who reduces risk through automation and strong governance.
- Curious, research-oriented: experiments with improvements, small R&D PoCs, and pilots.
- Customer-focused — understands business impact and communicates clearly under pressure.
- Collaborative team player who can mentor and enable developers to use platform features safely.
- Ownership mentality: sees operational issues through to resolution and implements preventative measures.
Learning & Growth
- Regular time allocated for learning (new Azure services, Copilot Studio features, IaC patterns).
- Opportunities to lead R&D experiments, prototype new resiliency patterns, and influence enterprise platform strategy.
- Clear career paths into Senior Platform Engineer, SRE Lead, Platform Architect, or Cloud Center of Excellence roles.
Working Conditions & Expectations
- On-call rotations (with compensatory time-off) and shift-handling as required for 24×7 services.
- Willingness to work cross-geography hours for critical releases or incident response.
Why Join Us
- Hands-on ownership of cutting-edge Microsoft automation and AI tooling.
- Opportunity to shape platform governance and enterprise-grade release processes.
- Fast learning curve with broad exposure: DevOps, cloud, data, automation, and security.
- Collaborative culture that encourages innovation, experimentation, and continuous improvement.